Preceding Rule Induction with Instance Reduction Methods

Authors

  • Osama Othman
  • Christopher H. Bryant
Abstract

A new pre-pruning technique for rule induction is presented which applies instance reduction before rule induction. An empirical evaluation records the predictive accuracy and size of rule-sets generated from 24 datasets from the UCI Machine Learning Repository. Three instance reduction algorithms (Edited Nearest Neighbour (ENN), AllKnn and DROP5) are compared. Each one is used to reduce the size of the training set, prior to inducing a set of rules using Clark and Boswell's modification of CN2. A hybrid instance reduction algorithm (a combination of AllKnn and DROP5) is also tested. For most of the datasets, pruning the training set using ENN, AllKnn or the hybrid significantly reduces the number of rules generated by CN2, without adversely affecting the predictive performance. The hybrid achieves the highest average predictive accuracy.
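The reduction step described in the abstract can be illustrated with a minimal sketch of Edited Nearest Neighbour, which in its usual formulation drops every training instance whose class disagrees with the majority class of its k nearest neighbours. This is not the authors' implementation: the function names `knn_majority` and `enn`, the choice of Euclidean distance on numeric features, and the default k = 3 are all assumptions made for illustration. The rule induction step (CN2) is not shown; the reduced set would simply be passed to the rule learner in place of the full training set.

```python
import math
from collections import Counter


def knn_majority(index, X, y, k):
    """Majority class among the k nearest neighbours of X[index], excluding itself.

    Assumes numeric feature tuples and Euclidean distance (an illustrative choice).
    On a tied vote, the class seen first among the nearest neighbours wins.
    """
    dists = sorted(
        (math.dist(X[index], X[j]), j) for j in range(len(X)) if j != index
    )
    votes = Counter(y[j] for _, j in dists[:k])
    return votes.most_common(1)[0][0]


def enn(X, y, k=3):
    """Keep only instances whose class agrees with their k-NN majority class."""
    keep = [i for i in range(len(X)) if knn_majority(i, X, y, k) == y[i]]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: two well-separated clusters plus one mislabelled point inside cluster "a".
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6), (0.5, 0.5)]
y = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]

X_reduced, y_reduced = enn(X, y, k=3)
# X_reduced, y_reduced would then be given to the rule learner (not shown here).
```

In this toy example the mislabelled point at (0.5, 0.5) is the only instance removed, which is the kind of noise filtering that may lead a rule learner to produce fewer rules.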

Similar resources

IRDDS: Instance reduction based on Distance-based decision surface

In instance-based learning, a training set is given to a classifier for classifying new instances. In practice, not all information in the training set is useful for classifiers. Therefore, it is convenient to discard irrelevant instances from the training set. This process is known as instance reduction, which is an important task for classifiers since through this process the time for classif...

Mining Soft-Matching Rules from Textual Data

Text mining concerns the discovery of knowledge from unstructured textual data. One important task is the discovery of rules that relate specific words and phrases. Although existing methods for this task learn traditional logical rules, soft-matching methods that utilize word-frequency information generally work better for textual data. This paper presents a rule induction system, TEXTRISE, th...

Noise-Tolerant Rule induction from Multi-Instance data

This paper addresses the issue of multiple-instance induction of rules in the presence of noise. It first proposes a multiple-instance extension of rule-based learning algorithms. Then, it shows what kind of noise can appear in multiple-instance data, and how to handle it theoretically. Finally, it describes the implementation of such a noise-tolerant multiple-instance learner, and shows its pe...

Unifying Instance-Based and Rule-Based Induction

Several well-developed approaches to inductive learning now exist, but each has specific limitations that are hard to overcome. Multi-strategy learning attempts to tackle this problem by combining multiple methods in one algorithm. This article describes a unification of two widely-used empirical approaches: rule induction and instance-based learning. In the new algorithm, instances are treated a...

Rule Induction by EDA with Instance-Subpopulations

In this paper, a new rule induction method using EDA with instance-subpopulations is proposed. The proposed method introduces the notion of an instance-subpopulation, which is a set of individuals matching a training instance. The EDA procedure is then carried out separately for each instance-subpopulation. Individuals generated by each EDA procedure are merged to constitute the population at the next ...

Publication date: 2013